Topics related to system management, administration, standard commands and so on...
Notes on this page:

mount, read only partitions, and weird output
[6]

If you run the "mount" command alone, you should get the list of the mounted partitions, something like:
/dev/sda1 on / type ext3 (rw,errors=remount-ro)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
Sometimes, especially when the root partition contains errors or has been mounted or remounted read-only, the output of mount is wrong: read-only partitions are shown as read-write, unmounted partitions are still shown as mounted, and so on.

The problem is that mount uses the file /etc/mtab for bookkeeping, writing to it whenever a partition is mounted, unmounted, or remounted with different options.

If /etc/mtab cannot be written, for example because the root partition is read-only, mount and many system tools cannot update it and get confused about the status of the partitions. Symptoms are: partitions unmounted without errors but still shown by the mount command, read-only partitions shown as read-write, and all kinds of inconsistencies...

To get a clean and correct view of the mounted partitions, take a look at the file /proc/mounts, with something like "cat /proc/mounts". That file shows the status of the partitions as seen by the kernel, and is correct most of the time.

Note, however, that on recent kernels (2.6.x and later), every process may have a different "view" of which file systems are mounted and unmounted. If the situation gets that confusing, one way to find out whether a partition is read-write or read-only is simply to try to "touch" a file on it... have fun :)
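
As a small illustration of reading /proc/mounts instead of trusting mount, here is a minimal sketch. The function names are made up for this example, and the optional second argument (an alternative file in the same format) exists only to make the functions easy to try out.

```shell
# Print the mount options the kernel reports for a mount point.
# $1 = mount point, $2 = file to read (default: /proc/mounts)
# Fields in /proc/mounts: device mountpoint fstype options dump pass
mount_opts() {
    awk -v mp="$1" '
        $2 == mp { opts = $4 }      # the last entry wins: later mounts hide earlier ones
        END { if (opts != "") print opts; else exit 1 }
    ' "${2:-/proc/mounts}"
}

# Succeed if the kernel considers the mount point read-only.
is_readonly() {
    mount_opts "$@" | tr ',' '\n' | grep -qx 'ro'
}

# Example:
# is_readonly / && echo "root is read-only"
```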


Common tcpdump options
[14]

Ok, while sniffing traffic, some options might actually be useful:

when inspecting the content of the packets...

use something like '-X -s 8192 -i eth0', where '-X' indicates to print packets both in HEX and ASCII, '-s 8192' increases the number of bytes tcpdump will actually inspect, and '-i eth0' indicates to listen on 'eth0'. Note that if you want to print the content of the whole packet, with '-s' you need to specify a value higher than the MTU of the interface. You can look at the MTU of your interface by using 'ifconfig eth0' or something like 'ip link show dev eth0'.
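
As a rough illustration of the '-s' rule above, the sketch below derives a snapshot length from the interface MTU and prints (does not run) a matching tcpdump command line. The helper names and the 64-byte margin for link-level headers are assumptions made for this example, not something tcpdump requires.

```shell
# Extract the MTU from the output of `ip link show dev IFACE` (read on stdin).
mtu_from_ip_link() {
    sed -n 's/.* mtu \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Print a tcpdump command line that captures whole packets.
# $1 = interface, $2 = MTU of that interface.
# The 64 extra bytes are an arbitrary margin for link-level headers.
tcpdump_full_packets() {
    echo "tcpdump -n -X -s $(( $2 + 64 )) -i $1"
}

# Example (needs a real interface, so it is left commented out):
# tcpdump_full_packets eth0 "$(ip link show dev eth0 | mtu_from_ip_link)"
```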

when checking routing/firewalling/nat problems...

use the '-e' parameter, to look at the link-level headers. Note that if we do not consider NAT, all IP packets will always have as src ip the ip address of the sending machine, and as dst ip the ip address of the final destination.

Packets that need to pass a router/gateway/firewall... will have, as dst IP, the IP address of the final destination. The packet, however, will go to the router thanks to link-level addressing, which, on ethernet, will cause the packet to have the MAC address of the router as the address of the recipient.

when looking for connectivity problems with particular networks/addresses/...

use the '-vvv' parameter, and have a careful look at all the headers printed by tcpdump. Take special care in checking ICMP packets (fragmentation needed, administratively prohibited, ...), fragmentation, the TTL, and the various IP/TCP options that might be set on the packet.

Also, remember to write a filter to isolate packets coming from the network you are inspecting. Be aware, however, that certain network errors might come from routers or other IP addresses than those you are filtering on, so take care not to filter out ICMP packets and not to be too strict with your filters. Something like:
 # tcpdump -n -vvv 'net xx.xx.xx.xx/24 or icmp' 
should work as expected.

Always remember to specify the '-n' parameter. Without '-n', all IP addresses and some other numbers (mainly ports and protocols) will be transformed from their numeric value into 'names'. However, this will:
  • greatly slow tcpdump down

  • create a mess if no filter has been given, or if you are inspecting DNS packets. Without '-n' ip addresses will be transformed into hostnames. Afaik, this will require DNS packets to be sent out to your own dns 'sometimes' (depending on the resolver cache), confusing the output a lot.


Setting the MTU/MSS of a given path and/or interface
[15]

Manually setting the MTU allows you to force the kernel to send smaller packets regardless of the media being used or protocols like Path MTU discovery or similar.

You can set the MTU either for a whole interface, using something like 'ifconfig eth0 mtu 200' or 'ip link set eth0 mtu 200', or for a single path with something like: 'ip route add 192.168.0.0/24 via 10.0.0.1 dev eth0 mtu 200'. In this case, you need to have the 'iproute2' package installed on your system.

You can also change the 'advertised' MSS to ask remote ends to send you smaller or bigger packets. To change the MSS for a single route, just use something like: 'ip route add 192.168.0.0/24 via 10.0.0.1 dev eth0 advmss 200' if you have the iproute2 package installed, or 'route add -net 192.168.0.0 netmask 255.255.255.0 gw 10.0.0.1 mss 200'.
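
For reference, the MSS that normally goes with a given MTU is the MTU minus the IP and TCP headers: 40 bytes for IPv4, 60 for IPv6, assuming no IP or TCP options. A tiny sketch of that arithmetic (the function name is made up for this example):

```shell
# Print the TCP MSS matching a given MTU.
# $1 = MTU, $2 = IP version (4 or 6, default 4).
# Assumes fixed-size headers: 20 bytes IPv4 or 40 bytes IPv6, plus 20 bytes TCP.
mss_for_mtu() {
    case "${2:-4}" in
        4) echo $(( $1 - 40 )) ;;
        6) echo $(( $1 - 60 )) ;;
        *) echo "unknown IP version: $2" >&2; return 1 ;;
    esac
}

# Example:
# mss_for_mtu 1500      # IPv4 over plain ethernet
# mss_for_mtu 1500 6    # IPv6 over plain ethernet
```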


List of ports, and missing PIDs from netstat -ntlp
[17]

One easy way to have the list of ports open on a Linux box is to use the 'netstat -nlp' command, 'netstat -ntlp' just for TCP, or 'netstat -nulp' for just UDP, where:

-n

tells netstat not to resolve ip addresses/port numbers into hostnames/port names.

-l

tells netstat to show only sockets waiting for connections (in listen state).

-p

tells netstat to show the pid of the involved processes.

-t

tells netstat to show only TCP sockets ('-u' does the same for UDP).

Sometimes, however, netstat will not show the pid of a given process, and will show a '-' instead. That does not mean there is 'no process' associated with the listening socket, or that your box has been hacked, as you can read in some messages on various mailing lists.

Most of the time, those ports have been opened by the linux kernel directly, and that is why 'there is no process' associated. Are you using khttpd? nfs? sun rpc? Most of the time, those ports are related to the 'portmapper' and protocols based on sun rpc. To see which ports are related to which service, supposing that rpc is the reason, you can use 'rpcinfo -p'. By running 'rpcinfo -p' on my notebook, I can see that port '2049', not bound to any process according to 'netstat', is the 'nfs' server. By running 'ps aux', I can see that something like '[nfsd]' is running (kernel NFS daemon), and with 'lsmod' I can see that the 'nfsd' module is loaded. If I go to /proc/2839, which is the PID of the nfsd process, I can see that the 'exe' symlink does not point anywhere: a good indication that 'nfsd' is not a real program, but a component of the kernel itself.
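
To quickly list the sockets for which netstat shows no pid, its output can be filtered. This is only a sketch: the function name is made up, and it assumes the usual net-tools layout where the last column is 'PID/Program name'. Remember that netstat also prints '-' for processes owned by other users when it is not run as root.

```shell
# Read `netstat -nlp` output on stdin and print protocol and local address
# of every TCP/UDP socket whose PID/Program column is just '-'.
sockets_without_pid() {
    awk '$1 ~ /^(tcp|udp)/ && $NF == "-" { print $1, $4 }'
}

# Example (run as root so that netstat can see every process):
# netstat -nlp | sockets_without_pid
# rpcinfo -p        # then compare the ports with the registered rpc services
```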


Nice general networking statistics
[18]

By using:
# netstat -s
it is possible to obtain nice-looking general purpose networking statistics, like the number of tcp connections established, failed, the number of ip packets sent, the number of packets with bad checksums, the number of segments retransmitted, and so on...
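
If you only care about one counter, the output is easy to filter. A minimal sketch, assuming a line of the form '<number> segments retransmitted' in the Tcp section; the function name is made up, and the pattern matches only the start of the last word because the exact spelling may differ between netstat versions.

```shell
# Read `netstat -s` output on stdin and print the number of
# retransmitted TCP segments (first matching line only).
retransmitted_segments() {
    awk '/segments retransmit/ { print $1; exit }'
}

# Example:
# netstat -s | retransmitted_segments
```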


Setting up raid partitions, 0xfd, and mdadm configuration file
[26]

As you can read on the Linux Software RAID HOWTO, you should set the type of raid partitions to 0xfd. Note, however, that there are two ways to assemble raid devices:

  • asking the kernel to do it automatically, at boot time.

  • by running mdadm or the raidhot tools right at boot time, telling them to assemble raid devices.

Setting the type of the partition to 0xfd is necessary only for raid devices that need to be automatically assembled by the kernel itself at boot time.

In practice: if you put 0xfd in the type flag of the partition table, the device will be automatically assembled at boot time. If you don't, you will need to configure mdadm and/or raidhot tools to do that for you (at boot time), or you will have to assemble the device manually.
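
To check which partitions are marked 0xfd, one option is to filter the dump printed by 'sfdisk -d'. The sketch below assumes the dump lines look like '/dev/sda1 : start= 63, size= 1000, Id=fd' (older sfdisk) or use 'type=fd' instead of 'Id=fd' (newer sfdisk); the function name is made up for the example.

```shell
# Read an `sfdisk -d` dump on stdin and print the partitions whose
# type is fd (Linux raid autodetect).
raid_autodetect_partitions() {
    sed -n -E 's/^([^ :]+) *:.*(Id|type)= *[fF][dD]([, ].*)?$/\1/p'
}

# Example (as root):
# sfdisk -d /dev/sda | raid_autodetect_partitions
```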

To create the mdadm configuration file, you have two choices: one, create it manually, two, run
  # mdadm --examine --scan > mdadm.conf 
  or 
  # mdadm --detail --scan > mdadm.conf
  
(the two commands give the same output, sometimes in a different order). These commands require your raid devices to be active and available. If they are not active, you can either assemble them manually, and run the above commands, or use the command:
  # mdadm --examine --brief --scan --config=partitions > mdadm.conf
  
The standard location for the mdadm.conf file, on Debian systems, is /etc/mdadm/mdadm.conf.

Mounting Software RAID 1 devices individually
[27]

Ok, let's say you have a software RAID 1 /dev/md0 device made of two partitions on two different scsi disks, /dev/sda1 and /dev/sdb1.

Let's say you just had a major hardware failure, and for one reason or another, some data was corrupted on the first device, while some other data was corrupted on the second device.

One easy way to try to recover data is to mount /dev/sda1 and /dev/sdb1 individually, recover data, and then, eventually, put them back into the raid.

Doing so is quite easy:

  • if the raid device is still mounted, unmount it immediately, with something like 'umount /mount/point/of/raid/device'. You can see the mount point of the md device with something like 'mount |grep md0' or 'cat /proc/mounts |grep md0' (see the note "mount, read only partitions, and weird output").

  • stop the raid device, with something like:
    # mdadm --misc --stop /dev/md0
         
    or mark it read only, with:
    # mdadm --misc --readonly /dev/md0
         

  • simply mount the devices independently, using:
     # mount -o ro /dev/sda1 /mnt/sda1
     # mount -o ro /dev/sdb1 /mnt/sdb1
         
    making sure the /mnt/sda1 and /mnt/sdb1 directories exist. (This works when the RAID superblock is stored at the end of the partition, as with the 0.90 metadata format.)

  • You can now work on /mnt/sda1 or /mnt/sdb1 without problems (...).

  • if you later decide that the content of one of the partitions is good, you can start the array in degraded mode with --assemble, or mark /dev/sdb1 (for example) as failed, and then simply replace it ...

For more details, take a look at 'man mdadm'.

pvcreate on an entire disk... with partitions existing!
[28]

Ok, so... you are switching from a non-LVM system to LVM... you have your /dev/sdb, and want to turn it into a Physical Volume to add to a simple Volume Group.

You try with a simple:
# pvcreate /dev/sdb
  
and you get an error:
Can't open /dev/sdb exclusively.  Mounted filesystem?
  
you check the mounted file systems, with something like:
mount |grep /dev/sdb
  
and nothing appears... (well, if something appears, just umount all the mounted partitions, and try one more time). So: pvcreate fails, reporting 'Mounted filesystem?', but no file systems are mounted.

The solution is quite simple: as long as the kernel sees partitions on the device, pvcreate will not be able to lock it to create a physical volume... (this is still true on a 2.6.16 with lvm2 2.02.05). So, remove the partitions (with cfdisk/sfdisk/fdisk, whatever you want, or by zeroing the first sector with dd, which wipes the whole partition table), run 'sfdisk -R /dev/sdb' to make the kernel re-read the partition table (newer util-linux versions dropped '-R'; 'blockdev --rereadpt /dev/sdb' does the same job), and you should be fine... run pvcreate, and you will have no more errors...
# dd if=/dev/zero of=/dev/sdb bs=512 count=1
# sfdisk -R /dev/sdb
# pvcreate /dev/sdb
  

backing up partition table using sfdisk...
[29]

To back up a partition table, you can simply run:
# sfdisk -d /dev/sda > /home/backup.file

To restore the backup, which means, to repartition the disk as the dump you just generated, you can simply run something like:
# sfdisk /dev/sda < /home/backup.file

Note that the above commands do not save/restore the full MBR: if a boot loader or something similar is installed, its code is not part of the backup... you may have better luck using dd, but watch out with extended partitions...


backing up the partition table using dd
[31]

To back up the partition table using dd, simply run:
# dd if=/dev/sda of=/tmp/backup.pt bs=512 count=1
  
where /dev/sda is the device whose partition table you want to back up, /tmp/backup.pt is the name of the file where you want the partition table to be stored, and bs=512 count=1 tells dd to copy 1 sector of 512 bytes from the beginning of the disk...

This method will back up the full MBR: both the partition table and the small fragment of boot code stored in front of it. However, IT WILL NOT BACK UP EXTENDED PARTITIONS!

This means that if you have /dev/sda5, /dev/sda6 or greater... the partition table records about those partitions will be lost. A better solution would be to save both the first sector of the disk using dd, and then the complete backup of the partition table using 'sfdisk -d' (take a look at http://notes.inscatolati.net/system[en]/storage[en]/index.html#29).
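
A minimal sketch of the dd half of that combined backup, with a cheap sanity check on the result. The function names are made up for this example; the check only looks for the 0x55 0xAA boot signature in the last two bytes of the sector, it says nothing about the partition entries themselves.

```shell
# Copy the first 512-byte sector (MBR) of a disk or disk image to a file.
# $1 = source device or image, $2 = output file.
backup_mbr() {
    dd if="$1" of="$2" bs=512 count=1 2>/dev/null
}

# Succeed if the file has the 0x55 0xAA boot signature at offset 510.
has_mbr_signature() {
    sig=$(od -An -tx1 -j510 -N2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Example (as root; the paths are the ones used in these notes):
# backup_mbr /dev/sda /tmp/backup.pt && has_mbr_signature /tmp/backup.pt
# sfdisk -d /dev/sda > /home/backup.file
```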


Creating the partition table on many disks...
[32]

So, you have just bought 10/15 bleeding edge scsi disks, of the same type, brand, and of the same kind, and you want to create a partition table on each of them?

A simple way would be to create a partition table on the first device (sda), with cfdisk, fdisk or whatever you like, and then back up the partition table using:
# sfdisk -d /dev/sda > /tmp/backup.pt

Now, for each device, you can simply run:
# sfdisk /dev/sdb < /tmp/backup.pt
In short, something like:
# cfdisk /dev/sda
# sfdisk -d /dev/sda > /tmp/backup.pt
# for dev in /dev/sd{b,c,d,e,f,g,h,i,j,k}; do \
>   sfdisk "$dev" < /tmp/backup.pt; done
should be enough...


Very slow boot, near "Setting up LVM Volume Groups..."
[35]

The whole message looks something like:
     Setting up LVM Volume Groups...
     Reading all physical volumes. This may take a while...
   
followed by a lot of kernel messages.

Well, that's the problem: pvscan, to find all the physical volumes, will just scan the first few sectors of every device on the system.

If you happen to have something particularly slow, a lot of devices, or ... you are an aficionado of devfs (which will probe modules as pvscan tries to find those volumes), you will want to change /etc/lvm/lvm.conf.

Just add something like:
     filter = [ "a|^/dev/md.*|", "a|^/dev/sd.*|", "r|.*|" ]
   
in the devices section of that file, to tell pvscan to search only in /dev/md* and /dev/sd*. The patterns are regular expressions, not shell globs, and a device that matches no pattern is accepted, hence the final reject-everything rule.
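
LVM tries the filter rules in order: the first rule that matches a device decides, and a device that matches no rule is accepted. The sketch below emulates that logic so a filter can be tried on a few device names before rebooting. It is an approximation: the function name is made up, rules must use '|' as the delimiter, and grep -E stands in for LVM's own regular expression engine.

```shell
# Print "accept" or "reject" for one device name under a list of filter rules.
# $1 = device path, remaining arguments = rules such as 'a|^/dev/sd.*|'
lvm_filter_decision() {
    dev=$1
    shift
    for rule in "$@"; do
        regex=${rule#?'|'}      # drop the leading 'a|' or 'r|'
        regex=${regex%'|'}      # drop the trailing '|'
        if printf '%s\n' "$dev" | grep -Eq -- "$regex"; then
            case $rule in
                a*) echo accept ;;
                *)  echo reject ;;
            esac
            return 0
        fi
    done
    echo accept                 # no rule matched: LVM accepts the device
}

# Example:
# lvm_filter_decision /dev/hda1 'a|^/dev/md.*|' 'a|^/dev/sd.*|' 'r|.*|'
```

Without a final 'r|.*|' rule, every device ends up accepted, so the filter would not shorten the scan.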

Logging the boot output
[45]

You have an init script failing with some weird error? The console is remote, and you need to check what's happening during the boot process? In Debian, nowadays, you can enable bootlogd to save whatever is printed on the console during the boot process. In /etc/default/bootlogd, just add the line:
 BOOTLOGD_ENABLE=yes
 
Not everything is logged, but it's much better than not having it. Just check dmesg, and the file /var/log/boot.

Creating a logical volume using all the space of a volume group
[47]

Just:

  • check how many extents you have available in the volume group

  • use lvcreate with the -l option to specify the number of extents

For example:
   % vgs -o vg_name,vg_size,vg_free,vg_extent_count,vg_free_count
   VG     VSize  VFree #Ext  Free
   system 67.96G 8.96G 17397 2293
   % lvcreate -n backup -l 2293 system
   
will create a logical volume 'backup', from the volume group 'system', using up all the space available (2293 extents).

Instead of using vgs, you can use vgdisplay, and use the field:
     [...]
     Free  PE / Size       2293 / 8.96 GB
     [...]
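
To avoid copying the number by hand, the free extent count can be picked out of the vgs output. A minimal sketch; the function name is made up for this example. Reasonably recent LVM2 versions also accept 'lvcreate -n backup -l 100%FREE system', which makes the whole exercise unnecessary if your version supports it.

```shell
# Read `vgs --noheadings -o vg_name,vg_free_count` output on stdin and
# print the number of free extents of the volume group named in $1.
free_extents() {
    awk -v vg="$1" '$1 == vg { print $2 }'
}

# Example (as root):
# lvcreate -n backup \
#     -l "$(vgs --noheadings -o vg_name,vg_free_count | free_extents system)" system
```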
   

Generated by CRON on 2012/02/14 at 06:26:35.